Abstract

The High Performance Embedded Computing Software Initiative (see
www.hpec-si.org) is addressing the military need to advance the state
of embedded software development tools, libraries, and methodologies
to retain the nation's military technology advantage in increasingly
software-based systems. Key accomplishments include completion of the
first demonstration and the development of the Parallel VSIPL++
standard. Currently the HPEC-SI effort is on track towards its goal of
changing the state-of-the-practice in programming DoD HPEC SIP
systems.


1  Introduction 

The High Performance Embedded Computing Software Initiative (HPEC-SI)
involves a partnership of industry, academia, and government
organizations to foster software technology insertion demonstrations,
to advance the development of existing standards, and to promote a
unified computation/communication embedded software standard. The goal
of the initiative is software portability: to enable
"write-once/run-anywhere/run-anysize" for applications of high
performance embedded computing (see [7, 4, 10, 8, 9, 18, 12]).

This paper gives a brief overview of the HPEC-SI program objectives,
technical objectives, and program plans. Detailed progress of the
demonstration, development, and applied research activities taking
place within HPEC-SI is reported in the HPEC2002 [15, 20, 27],
GOMAC2002 [26, 5, 11, 21, 23], and GOMAC2003 [28, 6, 14, 17, 22]
proceedings and at other conferences [16, 13].


*This work is sponsored by the High Performance Computing
Modernization Office, under Air Force Contract F19628-00-C-0002.
Opinions, interpretations, conclusions and recommendations are those
of the author and are not necessarily endorsed by the United States
Government.


2  Program  Objectives 

HPEC-SI is organized around demonstrations, standards development, and
applied research. Each of these activities is overseen by a Working
Group. The demonstrations team prime contractors with FFRDC or
academic partners to use currently defined standards, evaluate their
performance, and report on how well their needs are being met. The
first demonstration was with the Common Imagery Processor (CIP) and
successfully showed the use of the MPI communication standard ([1])
and the VSIPL computation standard ([2]) to achieve portability (while
preserving performance) across shared memory servers and distributed
memory embedded systems. The Development Working Group is extending
the VSIPL standard to include parallel object-oriented software
practices already prototyped by the research community. This effort is
tightly coupled with military demonstrations, and provides the next
generation of standards with direct feedback from the military user
base. The Applied Research Working Group is taking a longer term view
to assess the potential impact of a variety of emerging technologies
such as fault tolerance and dynamic scheduling, self-optimization, and
next generation high productivity languages.

3  Technical  Objectives 

The  HPEC-SI  program  uses  three  principal  metrics  to 
measure  the  progress  of  its  efforts: 

• Portability (reduction in lines-of-code that must change to
port/scale to a new system);

•  Productivity  (reduction  in  overall  lines-of-code); 

•  Performance  (computation  and  communication 
benchmarks). 

Traditionally, it has always been possible to improve in two of the
above areas while sacrificing the third. HPEC-SI aims to improve
quantitatively in all three areas.

HPEC-SI expects to achieve at least a 3x reduction in the number of
code changes necessary to port an application across computing
platforms. This improvement will primarily be achieved through the use
and enhancement of open software standards (MPI and VSIPL) that will
insulate applications from the details of the underlying hardware. An
equivalent reduction in code changes will also be seen when porting
from one size of platform to another. This will be achieved by the
development of a unified computation and communication standard
(Parallel VSIPL++) which will allow applications to be moved from a
computer with N processors to a computer with M processors with
minimal code changes.

HPEC-SI expects to achieve a 3x reduction in the total number of lines
of code necessary to implement an application. This productivity
improvement will come primarily from the use of higher level object
oriented languages (e.g. C++) as well as a unified computation and
communication library which will abstract away many of the
code-intensive details of writing a parallel program.

HPEC-SI expects to achieve a 1.5x increase in performance over
existing approaches on some computation and communication benchmarks.
This is primarily due to an increased level of abstraction which
allows the increased use of "early binding" in the application, in the
library and in the compiler. [Early binding is the process of building
data structures in advance that increase performance at runtime.]
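
The bracketed definition of early binding can be illustrated with a
small C++ sketch. It is not taken from VSIPL or from any HPEC-SI code:
the class name DftPlan and its interface are hypothetical, and a direct
O(n^2) DFT stands in for a real FFT. The point is only that the
twiddle-factor table is built once at setup time, so the run-time path
performs no trigonometric calls.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical illustration of early binding: the twiddle-factor table is
// built once, at setup time, so the run-time loop only does multiply-adds.
class DftPlan {
public:
    explicit DftPlan(std::size_t n) : n_(n), twiddle_(n) {
        assert(n > 0);
        const double pi = std::acos(-1.0);
        for (std::size_t k = 0; k < n; ++k) {
            const double angle =
                -2.0 * pi * static_cast<double>(k) / static_cast<double>(n);
            twiddle_[k] = std::complex<double>(std::cos(angle), std::sin(angle));
        }
    }

    // Run-time path: a direct O(n^2) DFT that reuses the precomputed table.
    std::vector<std::complex<double>> execute(
        const std::vector<std::complex<double>>& x) const {
        assert(x.size() == n_);
        std::vector<std::complex<double>> out(n_);
        for (std::size_t k = 0; k < n_; ++k) {
            std::complex<double> sum(0.0, 0.0);
            for (std::size_t j = 0; j < n_; ++j) {
                // exp(-2*pi*i*k*j/n) is periodic in k*j with period n.
                sum += x[j] * twiddle_[(k * j) % n_];
            }
            out[k] = sum;
        }
        return out;
    }

private:
    std::size_t n_;
    std::vector<std::complex<double>> twiddle_;
};
```

An application would construct the plan once during initialization and
call execute() for every incoming data block, which is the separation
of setup cost from run-time cost that the definition describes.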

4  Summary 

The current achievements of HPEC-SI include the successful utilization
of the Vector Signal and Image Processing Library (VSIPL) and the
Message Passing Interface (MPI) to demonstrate a tactical synthetic
aperture radar (SAR) code running without modifications and at high
performance on parallel embedded, server and cluster systems. HPEC-SI
is also creating the first parallel object oriented computation
standard by adding parallel and object oriented extensions to the
VSIPL standard. The Parallel VSIPL++ standard will allow high
performance parallel signal and image processing applications to take
advantage of the increased productivity offered by object oriented
programming as well as the performance advantages found using advanced
expression template technology. The draft object oriented
specification and reference code are both available on the HPEC-SI
website and are being tested by a variety of early adopters. Finally,
HPEC-SI is evaluating advanced software technologies such as fault
tolerance and the use of higher level languages to determine which
aspects are ready for future standardization. Combined, all of these
efforts are successfully changing the state-of-the-practice in
programming DoD HPEC SIP systems. Critical to this effort has been the
availability of a wide variety of HPCMO systems (Mercury, Sky, SGI,
Compaq, IBM, Linux, and FPGA) that has allowed the testing and
demonstration of advanced software technologies for DoD signal and
image processing applications.

Challenge: Transition advanced

Why Is DoD Concerned with
Embedded Software?


[Figure: Estimated DoD expenditures for embedded signal and image
processing hardware and software ($B)]


*  COTS  acquisition  practices  have  shifted  the  burden  from  “point  design” 
hardware  to  “point  design”  software 

*  Software  costs  for  embedded  systems  could  be  reduced  by  one-third 
with  improved  programming  models,  methodologies,  and  standards 

Issues with Current HPEC Development

•  High  Performance  Embedded 
Computing  pervasive  through  DoD 
applications 

-  Airborne  Radar  Insertion  program 

85%  software  rewrite  for  each  hardware 
platform 

-  Missile  common  processor 

Processor  board  costs  <  $100k 
Software  development  costs  >  $100M 

-  Torpedo  upgrade 

Two  software  re-writes  required  after  changes 
in  hardware  design 


Today  -  Embedded  Software  Is: 

*  Not  portable 

*  Not  scalable 

*  Difficult  to  develop 

*  Expensive  to  maintain 

Application software has traditionally
been tied to the hardware

Many acquisition programs are
developing stove-piped middleware

Open software standards can provide
portability, performance, and
productivity benefits

* Over 100 participants from over 20 organizations

• Parallel VSIPL++ v0.1 spec completed

Common Imagery Processor

- Demonstration Overview -

Common Imagery Processor (CIP)
is a cross-service component

Sample list of CIP modes

U-2 (ASARS-2, SYERS)
F/A-18 ATARS (EO/IR/APG-73)
LO HAE UAV (EO, SAR)
System Manager

MITRE   MIT Lincoln Laboratory   AFRL

* CIP picture courtesy of Northrop Grumman Corporation

Common Imagery Processor

- Demonstration Overview -

Demonstrate standards-based platform-independent CIP processing (ASARS-2)
Assess performance of current COTS portability standards (MPI, VSIPL)
Validate SW development productivity of emerging Data Reorganization Interface
MITRE and Northrop Grumman

Single code base optimized for all high performance architectures
provides future flexibility

Shared-Memory Servers
Embedded Multicomputers
Commodity Clusters
Massively Parallel Processors

Software Ports

Embedded Multicomputers

• CSPI - 500MHz PPC7410 (vendor loan)

• Mercury - 500MHz PPC7410 (vendor loan)

• Sky - 333MHz PPC7400 (vendor loan)

• Sky - 500MHz PPC7410 (vendor loan)

Mainstream Servers

• HP/COMPAQ ES40LP - 833-MHz Alpha ev6 (CIP hardware)

• HP/COMPAQ ES40 - 500-MHz Alpha ev6 (CIP hardware)

• SGI Origin 2000 - 250MHz R10k (CIP hardware)

• SGI Origin 3800 - 400MHz R12k (ARL MSRC)

• IBM 1.3GHz Power 4 (ARL MSRC)

• Generic LINUX Cluster

Portability: SLOC Comparison

[Chart categories: Sequential VSIPL, Shared Memory VSIPL, DRI VSIPL]

Shared Memory / CIP Server versus
Distributed Memory / Embedded Vendor

Application can now exploit many more processors, embedded processors
(3x form factor advantage) and Linux clusters (3x cost advantage)

Form Factor Improvements

Current Configuration

• IOP: 6U VME chassis (9 slots potentially available)

• IFP: HP/COMPAQ ES40LP

Possible Configuration

• IOP could support 2 G4 IFPs

• form factor reduction (x2)

• 6U VME can support 5 G4 IFPs

• processing capability increase (x2.5)

HPEC-SI Goals
1st Demo Achievements

Portability: zero code changes required
Productivity: DRI code 6x smaller vs MPI (est*)
Performance: 2x reduced cost or form factor

[Figure: HPEC Software Initiative (Demonstrate; Interoperable & Scalable)
surrounded by the three goals below]

Portability (Goal 3x, Achieved): reduction in lines-of-code to change
port/scale to new system

Productivity (Goal 3x, Achieved*): reduction in overall lines-of-code

Performance (Goal 1.5x, Achieved): computation and communication
benchmarks

Outline

• Introduction

• Demonstration

• Development
  - Object Oriented (VSIPL++)
  - Parallel (||VSIPL++)

• Applied Research

• Future Challenges

• Summary

Emergence of Component Standards

HPEC Initiative - Builds on
completed research and existing
standards and libraries

Definitions

VSIPL = Vector, Signal, and Image Processing Library
||VSIPL++ = Parallel Object Oriented VSIPL
MPI = Message-passing interface
MPI/RT = MPI real-time
DRI = Data Re-org Interface
CORBA = Common Object Request Broker Architecture
HP-CORBA = High Performance CORBA



VSIPL++ Productivity Examples

BLAS zherk Routine

* BLAS = Basic Linear Algebra Subprograms

* Hermitian matrix M: conjug(M) = M^t

* zherk performs a rank-k update of Hermitian matrix C:

C <- alpha * A * conjug(A)^t + beta * C

• VSIPL code

A = vsip_cmcreate_d(10,15,VSIP_ROW,MEM_NONE);
C = vsip_cmcreate_d(10,10,VSIP_ROW,MEM_NONE);
tmp = vsip_cmcreate_d(10,10,VSIP_ROW,MEM_NONE);
vsip_cmprodh_d(A,A,tmp);        /* A*conjug(A)^t */
vsip_rscmmul_d(alpha,tmp,tmp);  /* alpha*A*conjug(A)^t */
vsip_rscmmul_d(beta,C,C);       /* beta*C */
vsip_cmadd_d(tmp,C,C);          /* alpha*A*conjug(A)^t + beta*C */
vsip_cblockdestroy(vsip_cmdestroy_d(tmp));
vsip_cblockdestroy(vsip_cmdestroy_d(C));
vsip_cblockdestroy(vsip_cmdestroy_d(A));

• VSIPL++ code (also parallel)

Matrix<complex<double> > A(10,15);
Matrix<complex<double> > C(10,10);
C = alpha * prodh(A,A) + beta * C;

Sonar Example

• K-W Beamformer

• Converted C VSIPL to VSIPL++

• 2.5x fewer SLOCs
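
For readers who want the semantics of the rank-k update spelled out,
the following is a plain C++ reference sketch using only the standard
library. It is neither the VSIPL nor the VSIPL++ implementation: the
function name rank_k_update and the row-major storage layout are
assumptions made for illustration.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

using cd = std::complex<double>;

// Reference semantics only: C <- alpha * A * conjug(A)^t + beta * C,
// with A stored row-major as n x k and C stored row-major as n x n.
// The function name and storage layout are illustrative assumptions.
void rank_k_update(double alpha, const std::vector<cd>& A, std::size_t n,
                   std::size_t k, double beta, std::vector<cd>& C) {
    assert(A.size() == n * k);
    assert(C.size() == n * n);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            // Entry (i, j) of A * conjug(A)^t is row i of A times the
            // conjugate of row j of A.
            cd acc(0.0, 0.0);
            for (std::size_t p = 0; p < k; ++p) {
                acc += A[i * k + p] * std::conj(A[j * k + p]);
            }
            C[i * n + j] = alpha * acc + beta * C[i * n + j];
        }
    }
}
```

The triple loop makes explicit what the single VSIPL++ expression on
the slide computes; an optimized library would exploit the Hermitian
structure and update only one triangle of C.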

PVL PowerPC AltiVec Experiments

Results

• Hand coded loop achieves good performance, but is problem
specific and low level

• Optimized VSIPL performs well for simple expressions, worse
for more complex expressions

• PETE style array operators perform almost as well as the
hand-coded loop and are general, can be composed, and
are high-level

[Chart: AltiVec loop, VSIPL, and PETE/AltiVec compared for the
expressions A=B+C, A=B+C*D, A=B+C*D+E*F, and A=B+C*D+E/F]

Software Technology

AltiVec loop

• C

• For loop

• Direct use of AltiVec extensions

• Assumes unit stride

• Assumes vector alignment

VSIPL (vendor optimized)

• C

• AltiVec aware VSIPro Core Lite (www.mpi-softtech.com)

• No multiply-add

• Cannot assume unit stride

• Cannot assume vector alignment

PETE with AltiVec

• C++

• PETE operators

• Indirect use of AltiVec extensions

• Assumes unit stride

• Assumes vector alignment
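
The idea behind PETE style array operators can be sketched with a
minimal expression-template example. This is not the PETE API and it
contains no AltiVec code: the names Expr, Add, Mul, and Vec are
hypothetical, and only element-wise + and * on double vectors are
supported. It shows how an expression such as A=B+C*D can be evaluated
in a single loop without temporary vectors.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Minimal expression-template sketch (not the PETE API). Operators build a
// lightweight expression tree; assignment walks it in one fused loop, so no
// temporary vectors are created for sub-expressions such as C*D.
template <typename E>
struct Expr {
    const E& self() const { return static_cast<const E&>(*this); }
};

template <typename L, typename R>
struct Add : Expr<Add<L, R>> {
    const L& l;
    const R& r;
    Add(const L& a, const R& b) : l(a), r(b) {}
    double operator[](std::size_t i) const { return l[i] + r[i]; }
    std::size_t size() const { return l.size(); }
};

template <typename L, typename R>
struct Mul : Expr<Mul<L, R>> {
    const L& l;
    const R& r;
    Mul(const L& a, const R& b) : l(a), r(b) {}
    double operator[](std::size_t i) const { return l[i] * r[i]; }
    std::size_t size() const { return l.size(); }
};

class Vec : public Expr<Vec> {
public:
    explicit Vec(std::size_t n, double v = 0.0) : data_(n, v) {}
    double operator[](std::size_t i) const { return data_[i]; }
    double& operator[](std::size_t i) { return data_[i]; }
    std::size_t size() const { return data_.size(); }

    // Evaluate any expression tree element by element in a single pass.
    template <typename E>
    Vec& operator=(const Expr<E>& e) {
        const E& expr = e.self();
        assert(expr.size() == data_.size());
        for (std::size_t i = 0; i < data_.size(); ++i) data_[i] = expr[i];
        return *this;
    }

private:
    std::vector<double> data_;
};

template <typename L, typename R>
Add<L, R> operator+(const Expr<L>& a, const Expr<R>& b) {
    return Add<L, R>(a.self(), b.self());
}

template <typename L, typename R>
Mul<L, R> operator*(const Expr<L>& a, const Expr<R>& b) {
    return Mul<L, R>(a.self(), b.self());
}
```

With these definitions, A = B + C * D compiles to one loop whose body is
B[i] + C[i] * D[i]. The expression nodes hold references, so they are
only valid within the statement that builds them, which is the usual
restriction of this technique.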

Parallel Pipeline Mapping

Signal Processing Algorithm

Filter: XOUT = FIR(XIN)

Beamform: XOUT = w*XIN

Detect: XOUT = |XIN|>c

Mapping

Parallel Computer

• Data Parallel within stages

• Task/Pipeline Parallel across stages
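
As a serial reference for the three stages named on this slide, the
sketch below implements Filter, Beamform, and Detect in plain C++. The
function names, the use of real-valued data, and the Matrix alias are
assumptions for illustration; no parallel mapping is performed here.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;  // indexed [row][sample]

// Stage 1: FIR-filter every channel independently. Rows do not interact,
// which is what makes this stage data parallel.
Matrix fir(const Matrix& x, const std::vector<double>& h) {
    Matrix y(x.size());
    for (std::size_t c = 0; c < x.size(); ++c) {
        y[c].assign(x[c].size(), 0.0);
        for (std::size_t n = 0; n < x[c].size(); ++n) {
            for (std::size_t k = 0; k < h.size() && k <= n; ++k) {
                y[c][n] += h[k] * x[c][n - k];
            }
        }
    }
    return y;
}

// Stage 2: beamform, out = w * x, with w sized [beams][channels].
Matrix beamform(const Matrix& w, const Matrix& x) {
    assert(!x.empty());
    const std::size_t samples = x[0].size();
    Matrix out(w.size(), std::vector<double>(samples, 0.0));
    for (std::size_t b = 0; b < w.size(); ++b) {
        assert(w[b].size() == x.size());
        for (std::size_t c = 0; c < x.size(); ++c) {
            assert(x[c].size() == samples);
            for (std::size_t n = 0; n < samples; ++n) {
                out[b][n] += w[b][c] * x[c][n];
            }
        }
    }
    return out;
}

// Stage 3: detect, flag every sample whose magnitude exceeds threshold c.
std::vector<std::vector<bool>> detect(const Matrix& x, double c) {
    std::vector<std::vector<bool>> hits(x.size());
    for (std::size_t r = 0; r < x.size(); ++r) {
        hits[r].resize(x[r].size());
        for (std::size_t n = 0; n < x[r].size(); ++n) {
            hits[r][n] = std::fabs(x[r][n]) > c;
        }
    }
    return hits;
}
```

Calling detect(beamform(w, fir(x, h)), c) runs the stages back to back.
In a pipelined mapping each stage would run on its own set of
processors and hand its output to the next stage.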

Scalable Approach

#include <Vector.h>
#include <AddPvl.h>

void addVectors(aMap, bMap, cMap) {
    Vector< Complex<Float> > a('a', aMap, LENGTH);
    Vector< Complex<Float> > b('b', bMap, LENGTH);
    Vector< Complex<Float> > c('c', cMap, LENGTH);

    b = 1;
    c = 2;
    a = b + c;
}

Single Processor Mapping

Multi Processor Mapping

Lincoln Parallel Vector Library (PVL)

• Single processor and multi-processor code are the same

• Maps can be changed without changing software

• High level code is compact
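
The map concept can be made concrete with a small stand-alone sketch.
It does not use PVL, whose interfaces are not shown in this document:
the Map and Range types and the functions owned_range and add_vectors
are hypothetical, the distribution is assumed to be a simple block
distribution, and ranks are simulated inside one process rather than
run on separate processors.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a PVL-style map: the list of processor ranks
// over which a vector is block-distributed. None of these names are PVL's.
struct Map {
    std::vector<int> ranks;
};

struct Range {
    std::size_t begin;  // first owned index
    std::size_t end;    // one past the last owned index
};

// Index range owned by `rank`; an empty range if the rank is not in the map.
Range owned_range(const Map& map, int rank, std::size_t length) {
    assert(!map.ranks.empty());
    const std::size_t p = map.ranks.size();
    const std::size_t base = length / p;
    const std::size_t extra = length % p;
    for (std::size_t i = 0; i < p; ++i) {
        if (map.ranks[i] == rank) {
            // The first `extra` ranks each own one additional element.
            const std::size_t begin = i * base + (i < extra ? i : extra);
            const std::size_t size = base + (i < extra ? 1 : 0);
            return {begin, begin + size};
        }
    }
    return {0, 0};
}

// The algorithm is identical for any map: each rank adds only what it owns.
void add_vectors(const Map& map, int rank, const std::vector<double>& b,
                 const std::vector<double>& c, std::vector<double>& a) {
    assert(a.size() == b.size() && b.size() == c.size());
    const Range r = owned_range(map, rank, a.size());
    for (std::size_t i = r.begin; i < r.end; ++i) {
        a[i] = b[i] + c[i];
    }
}
```

Changing from one processor to several only changes the contents of
the Map object passed in; add_vectors itself is untouched, which is
the property the bullets above describe.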

Outline

• Introduction

• Demonstration

• Development

• Applied Research

• Hybrid Architectures (see SBR)

• Future Challenges

• Summary

Dynamic Mapping for Fault Tolerance

[Figure: Failure; maps on Nodes: 0,2 and on Nodes: 1,3 (Map2)]

• Switching processors is accomplished by switching maps

• No change to algorithm required

• Developing requirements for ||VSIPL++
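
A minimal sketch of switching maps after a failure is given below. It
is an assumption-laden illustration rather than the ||VSIPL++ design:
NodeMap, is_healthy, and select_map are hypothetical names, and failure
detection itself is outside the scope of the sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical map: the set of node ids a computation is assigned to.
using NodeMap = std::vector<int>;

// True if none of the map's nodes appears in the failed list.
bool is_healthy(const NodeMap& map, const std::vector<int>& failed) {
    return std::none_of(map.begin(), map.end(), [&](int node) {
        return std::find(failed.begin(), failed.end(), node) != failed.end();
    });
}

// Pick the first candidate map that avoids every failed node. The caller's
// algorithm is unchanged; only the map it is handed differs.
const NodeMap& select_map(const std::vector<NodeMap>& candidates,
                          const std::vector<int>& failed) {
    assert(!candidates.empty());
    for (const NodeMap& m : candidates) {
        if (is_healthy(m, failed)) return m;
    }
    assert(false && "no healthy map available");
    return candidates.front();
}
```

With candidate maps on nodes {0, 2} and {1, 3}, a failure of node 2
causes select_map to return the second map, and the algorithm is rerun
with it unchanged.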



Parallel Specification

[Chart: Clutter Calculation (Linux Cluster), plotted against Number of
Processors]

% Initialize
pMATLAB_Init; Ncpus=comm_vars.comm_size;

% Map X to first half and Y to second half.
mapX=map([1 Ncpus/2],{},[1:Ncpus/2]);
mapY=map([Ncpus/2 1],{},[Ncpus/2+1:Ncpus]);

% Create arrays.
X = complex(rand(N,M,mapX),rand(N,M,mapX));
Y = complex(zeros(N,M,mapY));

% Initialize coefficients
coefs = ...
weights = ...

% Parallel filter + corner turn.
Y(:,:) = conv2(coefs,X);

% Parallel matrix multiply.
Y(:,:) = weights*Y;

% Finalize pMATLAB and exit.
pMATLAB_Finalize; exit;

• Matlab is the main specification language for signal processing

• pMatlab allows parallel specifications using same mapping
constructs being developed for ||VSIPL++

Outline


*  Introduction 

*  Demonstration 

*  Development 

*  Applied  Research 

*  Future  Challenges 

*  Summary 

Optimal Mapping of Complex Algorithms


Application 


Different  Optimal  Maps 


Embedded 

Multi-computer 

Hardware 


•  Need  to  automate  process  of  mapping  algorithm  to  hardware 

HPEC-SI Future Challenges

[Figure: roadmap of Functionality versus Time, Phase 3 through Phase 5
(End of 5 Year Plan)]

Applied Research: Hybrid Architectures; Higher Languages (Java?);
PCA/Self-optimization

Development: Fault tolerance; Hybrid Architectures; Self-optimization

Demonstration: Unified Comp/Comm Lib; Fault tolerance;
Hybrid Architectures

Unified Comp/Comm Standard
• Demonstrate scalability

Demonstrate Fault Tolerance
• Increased reliability

Portability across architectures
• RISC/FPGA Transparency

Summary

HPEC-SI Program on track toward changing software practice
in DoD HPEC Signal and Image Processing

- Outside funding obtained for DoD program specific activities
(on top of core HPEC-SI effort)

- 1st Demo completed; 2nd selected

- World's first parallel, object oriented standard

- Applied research into task/pipeline parallelism; fault tolerance;
parallel specification

Keys to success

- Program Office Support: 5 Year Time horizon better match to
DoD program development

- Quantitative goals for portability, productivity and performance

- Engineering community support


Web  Links 


High  Performance  Embedded  Computing  Workshop 

http://www.ll.mit.edu/HPEC 

High  Performance  Embedded  Computing  Software  Initiative 

http://www.hpec-si.org/ 

Vector,  Signal,  and  Image  Processing  Library 

http://www.vsipl.org/ 

MPI  Software  Technologies,  Inc. 
http://www.mpi-softtech.com/ 

Data  Reorganization  Initiative 
http://www.data-re.org/ 

CodeSourcery, LLC
http://www.codesourcery.com/

MatlabMPI
http://www.ll.mit.edu/MatlabMPI